Summary
In this notebook we will open a netCDF4 file from the Earth Surface Mineral Dust Source Investigation (EMIT), specifically the Level 2A (L2A) Reflectance product. We will inspect its structure, plot the spectra of individual pixels, and plot the spatial coverage of a single scene. We will then use HoloViews streams to build an interactive plot.
Background
The EMIT instrument is an imaging spectrometer that measures light in visible and infrared wavelengths. These measurements display unique spectral signatures that correspond to the composition of the Earth's surface. The EMIT mission focuses specifically on mapping the composition of minerals to better understand the effects of mineral dust on the Earth system and on human populations, now and in the future. More details about EMIT and its associated products can be found in the README.md and on the EMIT website.
The L2A Reflectance Product contains estimated surface reflectance. Surface reflectance is the fraction of incoming solar radiation reflected by Earth's surface. Different materials reflect different proportions of radiation depending on their chemical composition, so this information can be used to determine the composition of a target. In this guide you will learn how to plot a layer from the L2A reflectance spatially and look at the spectral curve associated with individual pixels, which can be used to identify targets.
Requirements
See the /setup/ folder.
Learning Objectives
Open an EMIT .nc file as an xarray.Dataset
Tutorial Outline
1.1 Setup
1.2 Opening The Data
1.3 Plotting Data - Non-Orthorectified
1.4 Orthorectification
1.5 Plotting Data - Orthorectified
1.6 Saving Orthorectified Data
1.7 Interactive Spatial and Spectral Plots
from osgeo import gdal
import numpy as np
import math
import xarray as xr
import geoviews as gv
import holoviews as hv
import hvplot.xarray
import netCDF4 as nc
Download the L2A Reflectance EMIT scene located here to your /data/ folder, then define an object representing the file path, like below.
fp = '../data/EMIT_L2A_RFL_001_20220903T163129_2224611_012.nc'
EMIT data is distributed in a hierarchical netCDF4 (.nc) format. Inside the netCDF4 file there are three groups: the root group, containing reflectance values across the downtrack, crosstrack, and bands dimensions; the sensor_band_parameters group, containing the wavelength of each band center and the full width at half maximum (FWHM), which is the bandwidth at half of the maximum amplitude; and the location group, containing latitude and longitude values of each pixel as well as a geometric lookup table (GLT). The GLT is an orthorectified image that provides relative downtrack and crosstrack reference locations from the raw scene to facilitate fast projection of the dataset.
To access the .nc file we will use the xarray library. xarray only supports non-hierarchical (flat) datasets, meaning that when loading a netCDF4 file into an xarray.Dataset, only the root group is read by default; the other groups must be read separately. Since xarray does not recognize these other groups, their keys cannot be listed using that library. If we need to check which groups are present, we can use the netCDF4 library. In the case of EMIT data, the root group contains the reflectance values and some metadata. As mentioned above, the other groups are sensor_band_parameters and location. Assuming we don't know which groups are present, let's first explore the hierarchical structure of the EMIT data using the netCDF4 library.
ds = nc.Dataset(fp)
ds
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
ncei_template_version: NCEI_NetCDF_Swath_Template_v2.0
summary: The Earth Surface Mineral Dust Source Investigation (EMIT) is an Earth Ventures-Instrument (EVI-4) Mission that maps the surface mineralogy of arid dust source regions via imaging spectroscopy in the visible and short-wave infrared (VSWIR). Installed on the International Space Station (ISS), the EMIT instrument is a Dyson imaging spectrometer that uses contiguous spectroscopic measurements from 410 to 2450 nm to resolve absoprtion features of iron oxides, clays, sulfates, carbonates, and other dust-forming minerals. During its one-year mission, EMIT will observe the sunlit Earth's dust source regions that occur within +/-52° latitude and produce maps of the source regions that can be used to improve forecasts of the role of mineral dust in the radiative forcing (warming or cooling) of the atmosphere.\n\nThis file contains L2A estimated surface reflectances and geolocation data. Reflectance estimates are created using an Optimal Estimation technique - see ATBD for details. Reflectance values are reported as fractions (relative to 1).
keywords: Imaging Spectroscopy, minerals, EMIT, dust, radiative forcing
Conventions: CF-1.63
sensor: EMIT (Earth Surface Mineral Dust Source Investigation)
instrument: EMIT
platform: ISS
processing_version: V1.0
institution: NASA Jet Propulsion Laboratory/California Institute of Technology
license: https://science.nasa.gov/earth-science/earth-science-data/data-information-policy/
naming_authority: LPDAAC
date_created: 2022-11-14T09:50:54Z
keywords_vocabulary: NASA Global Change Master Directory (GCMD) Science Keywords
stdname_vocabulary: NetCDF Climate and Forecast (CF) Metadata Convention
creator_name: Jet Propulsion Laboratory/California Institute of Technology
creator_email: sarah.r.lundeen@jpl.nasa.gov
creator_url: https://earth.jpl.nasa.gov/emit/
project: Earth Surface Mineral Dust Source Investigation
project_url: https://emit.jpl.nasa.gov/
publisher_name: USGS LPDAAC
publisher_url: https://lpdaac.usgs.gov
publisher_email: lpdaac@usgs.gov
identifier_product_doi_authority: https://doi.org
flight_line: emit20220903t163129_o24611_s000
time_coverage_start: 2022-09-03T16:31:29+0000
time_coverage_end: 2022-09-03T16:31:41+0000
software_build_version: 010603
product_version: 01
history: PGE Input files: radiance_file=/beegfs/store/emit/ops/data/acquisitions/20220903/emit20220903t163129/l1b/emit20220903t163129_o24611_s000_l1b_rdn_b0106_v01.img, pixel_locations_file=/beegfs/store/emit/ops/data/acquisitions/20220903/emit20220903t163129/l1b/emit20220903t163129_o24611_s000_l1b_loc_b0106_v01.img, observation_parameters_file=/beegfs/store/emit/ops/data/acquisitions/20220903/emit20220903t163129/l1b/emit20220903t163129_o24611_s000_l1b_obs_b0106_v01.img, surface_model_config=/beegfs/store/emit/ops/repos/emit-sds-l2a/surface/surface_20221020.json
easternmost_longitude: -62.5120945327963
northernmost_latitude: -39.3067591475017
westernmost_longitude: -61.236221412633064
southernmost_latitude: -40.39610428069674
spatialResolution: 0.000542232520256367
spatial_ref: GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
geotransform: [-6.25120945e+01 5.42232520e-04 -0.00000000e+00 -3.93067591e+01
-0.00000000e+00 -5.42232520e-04]
day_night_flag: Day
title: EMIT L2A Surface Reflectance 60 m V001
dimensions(sizes): downtrack(1280), crosstrack(1242), bands(285), ortho_y(2009), ortho_x(2353)
variables(dimensions): float32 reflectance(downtrack, crosstrack, bands)
groups: sensor_band_parameters, location
When inspecting the ds object, we can see the metadata, dimensions, variables, and groups. To specifically view the groups, append .groups.keys() to the object.
ds.groups.keys()
dict_keys(['sensor_band_parameters', 'location'])
Now read the root reflectance group as an xarray.Dataset and preview it.
refl = xr.open_dataset(fp)
refl
<xarray.Dataset>
Dimensions: (downtrack: 1280, crosstrack: 1242, bands: 285)
Dimensions without coordinates: downtrack, crosstrack, bands
Data variables:
reflectance (downtrack, crosstrack, bands) float32 ...
Attributes: (12/38)
ncei_template_version: NCEI_NetCDF_Swath_Template_v2.0
summary: The Earth Surface Mineral Dust Source ...
keywords: Imaging Spectroscopy, minerals, EMIT, ...
Conventions: CF-1.63
sensor: EMIT (Earth Surface Mineral Dust Sourc...
instrument: EMIT
... ...
southernmost_latitude: -40.39610428069674
spatialResolution: 0.000542232520256367
spatial_ref: GEOGCS["WGS 84",DATUM["WGS_1984",SPHER...
geotransform: [-6.25120945e+01 5.42232520e-04 -0.00...
day_night_flag: Day
title: EMIT L2A Surface Reflectance 60 m V001
We can see that the information read in contains only the root variable (reflectance) and attribute metadata, not the contents of the groups we previously listed. Using those group names we can read the other groups into their own xarray.Dataset objects.
Read in the sensor_band_parameters group as an xarray dataset and preview it.
wvl = xr.open_dataset(fp,group='sensor_band_parameters')
wvl
<xarray.Dataset>
Dimensions: (bands: 285)
Dimensions without coordinates: bands
Data variables:
wavelengths (bands) float32 381.0 388.4 395.8 ... 2.486e+03 2.493e+03
fwhm (bands) float32 8.415 8.415 8.415 8.415 ... 8.806 8.807 8.809
Now we can merge the two xarray datasets into a single dataset.
ds = xr.merge([refl,wvl])
ds
<xarray.Dataset>
Dimensions: (downtrack: 1280, crosstrack: 1242, bands: 285)
Dimensions without coordinates: downtrack, crosstrack, bands
Data variables:
reflectance (downtrack, crosstrack, bands) float32 ...
wavelengths (bands) float32 381.0 388.4 395.8 ... 2.486e+03 2.493e+03
fwhm (bands) float32 8.415 8.415 8.415 8.415 ... 8.806 8.807 8.809
Attributes: (12/38)
ncei_template_version: NCEI_NetCDF_Swath_Template_v2.0
summary: The Earth Surface Mineral Dust Source ...
keywords: Imaging Spectroscopy, minerals, EMIT, ...
Conventions: CF-1.63
sensor: EMIT (Earth Surface Mineral Dust Sourc...
instrument: EMIT
... ...
southernmost_latitude: -40.39610428069674
spatialResolution: 0.000542232520256367
spatial_ref: GEOGCS["WGS 84",DATUM["WGS_1984",SPHER...
geotransform: [-6.25120945e+01 5.42232520e-04 -0.00...
day_night_flag: Day
title: EMIT L2A Surface Reflectance 60 m V001
Pick a random downtrack and crosstrack location. Here we chose 660, 370 (downtrack,crosstrack). Next, use the isel() function from xarray to select the spatial position, then use the hvplot.line() function to plot a line showing the reflectance at that location.
ds['reflectance'].isel(downtrack=660,crosstrack=370).hvplot.line(y='reflectance',x='bands', color='black')
We can see some flat regions in the spectral curve around bands 127 - 141 and 187 - 212. These are regions where water absorption features were removed. Data in these regions is typically noisy due to moisture in the atmosphere; these spectral regions therefore offer little information about targets and can be excluded from calculations.
These bands have been assigned a value of -0.01. We can mask them to improve visualization by using the where() function to keep only the values where reflectance is not equal to -0.01.
ds['reflectance'] = ds['reflectance'].where(ds['reflectance']!=-0.01)
ds
<xarray.Dataset>
Dimensions: (downtrack: 1280, crosstrack: 1242, bands: 285)
Dimensions without coordinates: downtrack, crosstrack, bands
Data variables:
reflectance (downtrack, crosstrack, bands) float32 0.02187 ... -0.004341
wavelengths (bands) float32 381.0 388.4 395.8 ... 2.486e+03 2.493e+03
fwhm (bands) float32 8.415 8.415 8.415 8.415 ... 8.806 8.807 8.809
Attributes: (12/38)
ncei_template_version: NCEI_NetCDF_Swath_Template_v2.0
summary: The Earth Surface Mineral Dust Source ...
keywords: Imaging Spectroscopy, minerals, EMIT, ...
Conventions: CF-1.63
sensor: EMIT (Earth Surface Mineral Dust Sourc...
instrument: EMIT
... ...
southernmost_latitude: -40.39610428069674
spatialResolution: 0.000542232520256367
spatial_ref: GEOGCS["WGS 84",DATUM["WGS_1984",SPHER...
geotransform: [-6.25120945e+01 5.42232520e-04 -0.00...
day_night_flag: Day
title: EMIT L2A Surface Reflectance 60 m V001
Since these datasets are large, we can go ahead and delete objects we won't be using to conserve memory.
del refl
del wvl
Plot the filtered reflectance values using the same downtrack and crosstrack position as above.
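# Select from the Dataset (not only the reflectance DataArray) so the wavelengths variable is available for the x axis
ds.isel(downtrack=660,crosstrack=370).hvplot.line(y='reflectance',x='wavelengths', color='black')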
Without the noisy data we can better interpret the spectral curve and hvplot will do a better job automatically scaling our axes.
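To illustrate why masking helps the automatic axis scaling, here is a minimal NumPy sketch on a made-up six-band spectrum. The reflectance values are invented for illustration, and np.isclose is used as an extra precaution against floating-point comparison issues; it is not what the cell above uses.

```python
import numpy as np

# Synthetic 6-band spectrum; -0.01 marks bands flagged as water absorption
# (the flag value comes from the text above, the other values are made up).
spectrum = np.array([0.12, 0.18, -0.01, -0.01, 0.25, 0.22], dtype=np.float32)

# Replace flagged bands with NaN so plotting tools skip them.
flagged = np.isclose(spectrum, -0.01)
masked = np.where(flagged, np.nan, spectrum)

# An automatic y-axis range spans min..max of the plotted values.
print(spectrum.min(), spectrum.max())        # range still includes the flag value
print(np.nanmin(masked), np.nanmax(masked))  # range of real reflectance only
```

After masking, the lower bound of the plotted range is set by the smallest real reflectance value rather than by the flag value.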
We can also plot the data spatially. Find the band nearest the 850nm wavelength in the NIR, then plot the data spatially using the isel() function to select only that band and using hvplot.image() to view the reflectance at 850nm of each pixel across the acquired region.
b850 = np.nanargmin(abs(ds['wavelengths'].values-850)) # Find band nearest to value of 850 nm (NIR)
ds.isel(bands=b850).hvplot.image(cmap='viridis', aspect = 'equal', rasterize=True)
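The nearest-band lookup used above can be wrapped in a small helper. This is a sketch only: nearest_band is a hypothetical name, and the evenly spaced band centers below are invented stand-ins for the real wavelengths array.

```python
import numpy as np

def nearest_band(wavelengths, target_nm):
    """Return the index of the band whose center is closest to target_nm."""
    return int(np.nanargmin(np.abs(np.asarray(wavelengths) - target_nm)))

# Hypothetical band centers in nm, evenly spaced for illustration only
wavelengths = np.arange(381.0, 2500.0, 7.45)

b850 = nearest_band(wavelengths, 850)
print(b850, wavelengths[b850])
```

With the real data, the same call would be nearest_band(ds['wavelengths'].values, 850).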
As previously mentioned, a geometric lookup table (GLT) is included in the location group of the netCDF4 file. Applying the GLT will orthorectify the image and give us latitude and longitude positional information.
Before orthorectifying, examine the location group from the dataset by reading it into xarray.
loc = xr.open_dataset(fp,group='location')
loc
<xarray.Dataset>
Dimensions: (downtrack: 1280, crosstrack: 1242, ortho_y: 2009, ortho_x: 2353)
Dimensions without coordinates: downtrack, crosstrack, ortho_y, ortho_x
Data variables:
lat (downtrack, crosstrack) float64 ...
lon (downtrack, crosstrack) float64 ...
elev (downtrack, crosstrack) float64 ...
glt_x (ortho_y, ortho_x) float64 ...
glt_y (ortho_y, ortho_x) float64 ...
We can see that each downtrack and crosstrack position has a latitude, longitude, and elevation, and that the glt_x and glt_y arrays use the ortho_x and ortho_y dimensions, which have a different shape. These arrays contain crosstrack and downtrack index values used to quickly reproject the data. We will use these indexes to build an array of 2009x2353x285 (lat,lon,bands), filling it with the data from the EMIT dataset using the included emit_tools Python module. emit_tools contains some helpful functions for working with EMIT data using xarray.
Import the emit_tools module and use the help function to see how it can be used.
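The index lookup described above can be sketched in plain NumPy on a tiny synthetic scene. This is not the emit_tools implementation: apply_glt_sketch is a hypothetical helper, and it assumes that glt_x and glt_y hold 1-based crosstrack and downtrack indices with 0 meaning "no data" (consistent with the GLT_NODATA_VALUE default shown in the help text below).

```python
import numpy as np

def apply_glt_sketch(data, glt_x, glt_y, glt_nodata=0, fill_value=np.nan):
    """Place raw (downtrack, crosstrack, bands) pixels on the ortho grid.

    Assumes glt_x / glt_y hold 1-based crosstrack / downtrack indices and
    that glt_nodata marks ortho cells with no matching raw pixel.
    """
    glt_x = np.asarray(glt_x).astype(int)
    glt_y = np.asarray(glt_y).astype(int)
    out = np.full(glt_x.shape + (data.shape[-1],), fill_value, dtype=np.float32)
    valid = (glt_x != glt_nodata) & (glt_y != glt_nodata)
    # Subtract 1 to convert the assumed 1-based GLT indices to NumPy indices
    out[valid] = data[glt_y[valid] - 1, glt_x[valid] - 1]
    return out

# Tiny synthetic scene: 2 x 2 raw pixels with 3 bands each
raw = np.arange(12, dtype=np.float32).reshape(2, 2, 3)
# 3 x 3 ortho grid; zeros mean "no raw pixel maps here"
glt_x = np.array([[0, 1, 0], [1, 2, 0], [0, 2, 0]])
glt_y = np.array([[0, 1, 0], [2, 1, 0], [0, 2, 0]])

ortho = apply_glt_sketch(raw, glt_x, glt_y)
print(ortho.shape)
```

Each valid ortho cell receives the full spectrum of one raw pixel, and cells without a match keep the fill value.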
Note: This function currently works with L1B Radiance and L2A Reflectance Data.
import sys
sys.path.append('../modules/')
from emit_tools import emit_xarray
help(emit_xarray)
Help on function emit_xarray in module emit_tools:
emit_xarray(filepath, ortho=True, qmask=None, unpacked_bmask=None, GLT_NODATA_VALUE=0, fill_value=-9999)
This function utilizes other functions in this module to streamline opening an EMIT dataset as an xarray.Dataset.
Parameters:
filepath: a filepath to an EMIT netCDF file
ortho: True or False, whether to orthorectify the dataset or leave in crosstrack/downtrack coordinates.
qmask: a numpy array output from the quality_mask function used to mask pixels based on quality flags selected in that function. Any non-orthorectified array with the proper crosstrack and downtrack dimensions can also be used.
unpacked_bmask: a numpy array from the band_mask function that can be used to mask band-specific pixels that have been interpolated.
GLT_NODATA_VALUE: no data value for the GLT tables, 0 by default
fill_value: the fill value for EMIT datasets, -9999 by default
Returns:
out_xr: an xarray.Dataset constructed based on the parameters provided.
We can see that the emit_xarray function will automatically apply the GLT to orthorectify the data unless ortho = False. The function will also apply masks if desired during construction of the output xarray.Dataset.
Use the emit_xarray function to read in and orthorectify the L2A reflectance data.
For a detailed walkthrough of the orthorectification process using the GLT see section 2 of the How_to_Orthorectify.ipynb in the how-tos folder.
ds_geo = emit_xarray(fp, ortho=True)
ds_geo
<xarray.Dataset>
Dimensions: (latitude: 2009, longitude: 2353, bands: 285)
Coordinates:
* latitude (latitude) float64 -39.31 -39.31 -39.31 ... -40.39 -40.4 -40.4
* longitude (longitude) float64 -62.51 -62.51 -62.51 ... -61.24 -61.24
wavelengths (bands) float32 381.0 388.4 395.8 ... 2.486e+03 2.493e+03
fwhm (bands) float32 8.415 8.415 8.415 8.415 ... 8.806 8.807 8.809
spatial_ref int64 0
Dimensions without coordinates: bands
Data variables:
reflectance (latitude, longitude, bands) float32 nan nan nan ... nan nan
Attributes: (12/38)
ncei_template_version: NCEI_NetCDF_Swath_Template_v2.0
summary: The Earth Surface Mineral Dust Source ...
keywords: Imaging Spectroscopy, minerals, EMIT, ...
Conventions: CF-1.63
sensor: EMIT (Earth Surface Mineral Dust Sourc...
instrument: EMIT
... ...
southernmost_latitude: -40.39610428069674
spatialResolution: 0.000542232520256367
spatial_ref: GEOGCS["WGS 84",DATUM["WGS_1984",SPHER...
geotransform: [-6.25120945e+01 5.42232520e-04 -0.00...
day_night_flag: Day
title: EMIT L2A Surface Reflectance 60 m V001
Now that the data has been orthorectified, plot the georeferenced dataset using the same single wavelength (850 nm) as above. We can use the aspect = 'equal' option to preserve the square pixel dimensions. Setting rasterize = True helps save memory and reduces the size of this notebook. For higher quality outputs, this option can be omitted.
ds_geo.isel(bands=b850).hvplot.image(cmap='viridis', frame_width=500, aspect = 'equal', rasterize=True)
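The latitude and longitude coordinate values shown for ds_geo are consistent with the geotransform attribute in the file metadata. The sketch below rebuilds coordinate arrays from that geotransform, assuming the coordinates refer to pixel centers (hence the half-pixel offset); whether emit_tools computes them exactly this way is an assumption.

```python
import numpy as np

# Geotransform and ortho grid size copied from the file metadata printed above
gt = [-6.25120945e+01, 5.42232520e-04, 0.0, -3.93067591e+01, 0.0, -5.42232520e-04]
n_rows, n_cols = 2009, 2353  # ortho_y, ortho_x

# Assumption: coordinates refer to pixel centers, hence the half-pixel offset
longitude = gt[0] + (np.arange(n_cols) + 0.5) * gt[1]
latitude = gt[3] + (np.arange(n_rows) + 0.5) * gt[5]

print(longitude[0], longitude[-1])
print(latitude[0], latitude[-1])
```

The negative y pixel size in the geotransform is why latitude decreases from the first row to the last.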
We can also plot the data over an imagery tile using the geo=True and tiles= parameters instead of aspect = 'equal'. Any tile source available in geoviews should work here. This will change the axis names, but that can be fixed by adding them manually in the options, like below.
ds_geo.isel(bands=b850).hvplot.image(cmap='viridis', frame_width=500, geo=True, tiles='EsriImagery',rasterize=True).opts(
xlabel=f'{ds_geo.longitude.long_name} ({ds_geo.longitude.units})', ylabel=f'{ds_geo.latitude.long_name} ({ds_geo.latitude.units})')
We can see that the orthorectification step rotated the image and placed it on a Lat/Lon grid. Now that we have a better idea of what the target area looks like, we can also plot the spectra using the georeferenced data. First, filter out the water absorption bands using the same where() approach as in section 1.3.
ds_geo = ds_geo.where(ds_geo['reflectance']!=-0.01)
Now, plot the spectra at the Lat/Lon coordinates provided below.
ds_geo.sel(longitude=-61.833,latitude=-39.710,method='nearest').hvplot.line(y='reflectance',x='wavelengths', color='black', frame_width=400)
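To make the nearest-neighbour selection above more concrete, here is a minimal sketch on a small synthetic dataset laid out like ds_geo. The coordinate and reflectance values are invented; only the sel(..., method='nearest') call mirrors the cell above.

```python
import numpy as np
import xarray as xr

# Small synthetic stand-in for ds_geo: 3 x 4 pixels, 5 bands (values are made up)
lat = np.array([-39.70, -39.71, -39.72])
lon = np.array([-61.84, -61.83, -61.82, -61.81])
wavelengths = np.array([450.0, 550.0, 650.0, 850.0, 1600.0])
refl = np.random.default_rng(0).random((3, 4, 5)).astype(np.float32)

demo = xr.Dataset(
    data_vars={'reflectance': (['latitude', 'longitude', 'bands'], refl)},
    coords={'latitude': lat, 'longitude': lon, 'wavelengths': ('bands', wavelengths)},
)

# Nearest-neighbour lookup: the requested point need not match a pixel center
pixel = demo.sel(longitude=-61.833, latitude=-39.710, method='nearest')
print(float(pixel.latitude), float(pixel.longitude))
print(pixel['reflectance'].shape)
```

The result is a single spectrum along the bands dimension, with wavelengths still attached as a coordinate for plotting.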
At this point, the ds_geo orthorectified EMIT data can also be written as a flattened netCDF4 output that can be read using the xarray.open_dataset function, if desired.
ds_geo.to_netcdf('../data/geo_ds_out.nc')
# Example for Opening
# ds = xr.open_dataset('../data/geo_ds_out.nc')
Combining the spatial and spectral information into a single visualization can be a powerful tool for exploring data and inspecting data quality. Using the streams functionality of HoloViews we can link a spatial map to a plot of spectra.
We could plot a single band image as we previously have, but using an RGB image may help infer what targets we're examining. Build an RGB image following the steps below.
Select bands to represent red, green, and blue by finding the nearest to a chosen wavelength.
# Find Nearest Bands
b650 = np.nanargmin(abs(ds_geo['wavelengths'].values-650)) # Find band nearest to value of 650 nm (red)
b560 = np.nanargmin(abs(ds_geo['wavelengths'].values-560)) # Find band nearest to value of 560 nm (green)
b470 = np.nanargmin(abs(ds_geo['wavelengths'].values-470)) # Find band nearest to value of 470 nm (blue)
Next, write a function to build an array from the chosen bands and scale the values using a gamma correction. Without applying this scaling the majority of the image would be very dark, with the reflectance data being skewed by the few pixels with very high reflectance.
Note: This has no impact on analysis or data, just visualizing the RGB map.
def gamma_adjust(ds,band):
    # Define Array
    array = ds['reflectance'].isel(bands=band).values
    # Rescale Values using gamma to adjust brightness
    gamma = math.log(0.2)/math.log(np.nanmean(array)) # Create exponent for gamma scaling - can be adjusted by changing 0.2
    scaled = np.power(array,gamma).clip(0,1) # Apply scaling and clip to 0-1 range
    scaled = np.nan_to_num(scaled, nan = 1) # Assign NaNs to 1 so they appear white in plots
    return scaled
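As a quick check of how the exponent behaves: choosing gamma = log(0.2) / log(mean) means that mean ** gamma equals 0.2, so the average reflectance is mapped to a brightness of 0.2. A minimal sketch with made-up reflectance values:

```python
import math
import numpy as np

# Made-up reflectance values: mostly dark pixels plus one brighter pixel
array = np.array([0.02, 0.03, 0.05, 0.04, 0.36])

mean = np.nanmean(array)
gamma = math.log(0.2) / math.log(mean)  # same formula as in gamma_adjust
scaled = np.power(array, gamma).clip(0, 1)

# By construction, the mean reflectance maps to the 0.2 target brightness
print(mean ** gamma)
print(scaled.round(3))
```

Because the mean here is below 0.2, gamma is less than 1 and every pixel is brightened; a scene with a mean above 0.2 would be darkened instead.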
Now apply this function to each of the selected bands, stack them, build the arrays of coordinates (Lat, Lon, Bands) needed to create an xarray.Dataset, then build the dataset.
# Scale the Bands
r = gamma_adjust(ds_geo,b650)
g = gamma_adjust(ds_geo,b560)
b = gamma_adjust(ds_geo,b470)
# Stack Bands and make an index
rgb = np.stack([r,g,b]) # Stack r,g,b arrays along a new leading 'bands' axis
bds = np.array([0,1,2])
# Pull lat and lon values from geocorrected arrays
x = ds_geo['longitude'].values
y = ds_geo['latitude'].values
# Create new rgb xarray data array.
data_vars = {'RGB':(['bands','latitude','longitude'], rgb)}
coords = {'bands':(['bands'],bds), 'latitude':(['latitude'],y), 'longitude':(['longitude'],x)}
attrs = ds_geo.attrs
ds_rgb = xr.Dataset(data_vars=data_vars, coords=coords, attrs=attrs)
ds_rgb.coords['latitude'].attrs = ds_geo['latitude'].attrs
ds_rgb.coords['longitude'].attrs = ds_geo['longitude'].attrs
ds_rgb
<xarray.Dataset>
Dimensions: (bands: 3, latitude: 2009, longitude: 2353)
Coordinates:
* bands (bands) int64 0 1 2
* latitude (latitude) float64 -39.31 -39.31 -39.31 ... -40.39 -40.4 -40.4
* longitude (longitude) float64 -62.51 -62.51 -62.51 ... -61.24 -61.24 -61.24
Data variables:
RGB (bands, latitude, longitude) float32 1.0 1.0 1.0 ... 1.0 1.0 1.0
Attributes: (12/38)
ncei_template_version: NCEI_NetCDF_Swath_Template_v2.0
summary: The Earth Surface Mineral Dust Source ...
keywords: Imaging Spectroscopy, minerals, EMIT, ...
Conventions: CF-1.63
sensor: EMIT (Earth Surface Mineral Dust Sourc...
instrument: EMIT
... ...
southernmost_latitude: -40.39610428069674
spatialResolution: 0.000542232520256367
spatial_ref: GEOGCS["WGS 84",DATUM["WGS_1984",SPHER...
geotransform: [-6.25120945e+01 5.42232520e-04 -0.00...
day_night_flag: Day
title: EMIT L2A Surface Reflectance 60 m V001
Lastly, use the RGB dataset to build a map object with the hvplot.rgb() function.
# Define RGB Image
map = ds_rgb.hvplot.rgb(x='longitude', y='latitude', bands='bands', aspect = 'equal', frame_width=400)
To visualize the spectral and spatial data side-by-side, we use the PointerXY and Tap streams from the holoviews library. First, define a stream that tracks the pointer x and y position on the spatial plot, then define a stream that captures the x and y position of a click on the spatial plot.
Next, define a function to plot the spectra based on these two sets of x and y coordinates on the map. This will allow us to return spectra from a position we clicked on the image, and spectra where the mouse is currently hovering, allowing comparison of pixel reflectance values.
# Stream of X and Y positional data
posxy = hv.streams.PointerXY(source=map, x=-61.833, y=-39.710)
clickxy = hv.streams.Tap(source=map, x=-61.833, y=-39.710)
# Function to build a new spectral plot based on mouse hover positional information retrieved from the RGB image using our full reflectance dataset
def point_spectra(x,y):
    return ds_geo.sel(longitude=x,latitude=y,method='nearest').hvplot.line(y='reflectance',x='wavelengths',
                                                                         color='#1b9e77', frame_width=400)
# Function to build spectral plot of clicked location to show on hover stream plot
def click_spectra(x,y):
    return ds_geo.sel(longitude=x,latitude=y,method='nearest').hvplot.line(y='reflectance',x='wavelengths',
                                                                         color='#d95f02', frame_width=400)
# Define the Dynamic Maps
point_dmap = hv.DynamicMap(point_spectra, streams=[posxy])
click_dmap = hv.DynamicMap(click_spectra, streams=[clickxy])
# Plot the Map and Dynamic Map side by side
(map + click_dmap*point_dmap)